Open Problem: Fast Stochastic Exp-Concave Optimization
Abstract
Stochastic exp-concave optimization is an important primitive in machine learning that captures several fundamental problems, including linear regression, logistic regression and more. The exp-concavity property allows for fast convergence rates compared to general stochastic optimization. However, current algorithms that attain such rates scale poorly with the dimension n, running in time super-linear in n even on very simple instances of the problem. The question we pose is whether it is possible to obtain fast rates for exp-concave functions using more computationally efficient algorithms.

Consider the problem of minimizing a convex function F over a convex set K ⊆ Rⁿ, where our only access to F is via a stochastic gradient oracle that, given a point x ∈ K, returns a random vector ĝx for which E[ĝx] = ∇F(x). We make the following assumptions:

(i) F is α-exp-concave and twice differentiable; that is, if gx = ∇F(x) and Hx = ∇²F(x) are the gradient and Hessian at some point x ∈ K, then Hx ⪰ α·gx gxᵀ.
(ii) The gradient oracle satisfies ‖ĝx‖₂ ≤ G with probability 1 at any point x ∈ K, for some positive constant G.
(iii) For concreteness, we assume that K = {x ∈ Rⁿ : ‖x‖₂ ≤ 1} is the Euclidean unit ball.

An important special case is when F is given as an expectation F(x) = E_{z∼D}[f(x, z)] over an unknown distribution D of parameters z, where for every fixed parameter value z the function f(x, z) is α-exp-concave with gradients bounded by G. Indeed, this implies that F is itself α-exp-concave (see Appendix A). Given the ability to sample from the distribution D, we can implement a gradient oracle by setting ĝx = ∇f(x, z) where z ∼ D. For example, f(x, (a, b)) = ½(aᵀx − b)² corresponds to linear regression. In a learning scenario it is reasonable to assume that f(x, (a, b)) ≤ M with probability 1 for some constant M, which also guarantees that f is exp-concave with α = 1/M. Additional examples include the log-loss f(x, a) = −log(aᵀx) and the logistic loss f(x, (a, b)) = log(1 + exp(−b·aᵀx)); both are exp-concave provided that a, b and x are suitably bounded.

The goal of an optimization algorithm, given a target accuracy ε, is to compute a point x̄ for which F(x̄) − min_{x∈K} F(x) ≤ ε (either in expectation, or with high probability). The standard approach to general stochastic optimization, namely the Stochastic Gradient Descent (SGD) algorithm, computes an ε-approximate solution using O(1/ε²) oracle queries. Since each iteration runs in linear time¹, the total runtime of this approach is O(n/ε²).

¹ We assume that an oracle query runs in time O(1).
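To make the oracle model above concrete, the following is a minimal Python sketch (not part of the original problem statement) of a stochastic gradient oracle for the linear-regression loss f(x, (a, b)) = ½(aᵀx − b)², together with the projected-SGD baseline discussed above. The synthetic distribution D, the step-size schedule eta0/√t, and the iteration count T are illustrative assumptions; the sketch only demonstrates the oracle interface and the O(n/ε²)-time baseline, not a fast exp-concave method.

```python
# Minimal sketch (illustrative assumptions throughout): a stochastic gradient
# oracle for f(x, (a, b)) = 1/2 (a^T x - b)^2 and projected SGD over the
# Euclidean unit ball K = {x : ||x||_2 <= 1}.
import numpy as np

rng = np.random.default_rng(0)
n = 20                                  # dimension (illustrative)
x_star = rng.normal(size=n)
x_star /= 2 * np.linalg.norm(x_star)    # ground-truth regressor inside K

def sample_z():
    """Draw a parameter z = (a, b) from a synthetic distribution D (assumed)."""
    a = rng.normal(size=n)
    a /= np.linalg.norm(a)              # keep ||a|| bounded so gradients are bounded
    b = a @ x_star + 0.1 * rng.normal()
    return a, b

def grad_oracle(x):
    """Unbiased stochastic gradient of F(x) = E_z[f(x, z)]:
    ghat_x = (a^T x - b) * a, so that E[ghat_x] = grad F(x)."""
    a, b = sample_z()
    return (a @ x - b) * a

def project_ball(x):
    """Euclidean projection onto the unit ball K."""
    norm = np.linalg.norm(x)
    return x if norm <= 1.0 else x / norm

def sgd(T, eta0=1.0):
    """Projected SGD with step size eta0 / sqrt(t); returns the average iterate.
    This is the O(1/eps^2)-query, O(n/eps^2)-time baseline from the text."""
    x = np.zeros(n)
    avg = np.zeros(n)
    for t in range(1, T + 1):
        x = project_ball(x - (eta0 / np.sqrt(t)) * grad_oracle(x))
        avg += (x - avg) / t            # running average of iterates
    return avg

x_bar = sgd(T=20000)
print("||x_bar - x*|| =", np.linalg.norm(x_bar - x_star))
```

Any algorithm addressing the open problem would replace sgd above while keeping the same grad_oracle interface and per-query cost.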
Similar Resources
Fast Rates for Exp-concave Empirical Risk Minimization
We consider Empirical Risk Minimization (ERM) in the context of stochastic optimization with exp-concave and smooth losses—a general optimization framework that captures several important learning problems including linear and logistic regression, learning SVMs with the squared hinge-loss, portfolio selection and more. In this setting, we establish the first evidence that ERM is able to attain ...
Fast rates with high probability in exp-concave statistical learning (Appendix A: Proofs for Stochastic Exp-Concave Optimization)
This condition is equivalent to stochastic mixability as well as the pseudoprobability convexity (PPC) condition, both defined by Van Erven et al. (2015). To be precise, for stochastic mixability, in Definition 4.1 of Van Erven et al. (2015), take their Fd and F both equal to our F , their P equal to {P}, and ψ(f) = f∗; then strong stochastic mixability holds. Likewise, for the PPC condition, i...
Fast Algorithms for Online Stochastic Convex Programming
We introduce the online stochastic Convex Programming (CP) problem, a very general version of stochastic online problems which allows arbitrary concave objectives and convex feasibility constraints. Many wellstudied problems like online stochastic packing and covering, online stochastic matching with concave returns, etc. form a special case of online stochastic CP. We present fast algorithms f...
A Simple Analysis for Exp-concave Empirical Minimization with Arbitrary Convex Regularizer
In this paper, we present a simple analysis of fast rates with high probability of empirical minimization for stochastic composite optimization over a finite-dimensional bounded convex set with exponentially concave loss functions and an arbitrary convex regularization. To the best of our knowledge, this result is the first of its kind. As a byproduct, we can directly obtain the fast rate with ...
Black-Box Reductions for Parameter-free Online Learning in Banach Spaces
We introduce several new black-box reductions that significantly improve the design of adaptive and parameterfree online learning algorithms by simplifying analysis, improving regret guarantees, and sometimes even improving runtime. We reduce parameter-free online learning to online exp-concave optimization, we reduce optimization in a Banach space to one-dimensional optimization, and we reduce...